Overview

Dataset statistics

Number of variables: 18
Number of observations: 148670
Missing cells: 116710
Missing cells (%): 4.4%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 20.4 MiB
Average record size in memory: 144.0 B
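
The derived figures in this table can be cross-checked from the raw counts. The sketch below recomputes the missing-cell percentage and the memory footprint; the 8 bytes per cell is an assumption (64-bit values or object pointers, shallow size), not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 148_670
missing_cells = 116_710

# Missing cells are expressed as a share of all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: every cell occupies 8 bytes (shallow memory usage).
bytes_per_cell = 8
record_bytes = n_variables * bytes_per_cell
total_mib = n_observations * record_bytes / 2**20

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {record_bytes} B")
print(f"total size:    {total_mib:.1f} MiB")
```

Under that assumption the record size and total size agree with the reported 144.0 B and 20.4 MiB.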

Variable types

Categorical: 10
Numeric: 8

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
construction_type is highly imbalanced (99.7%) [Imbalance]
rate_of_interest has 36439 (24.5%) missing values [Missing]
upfront_charges has 39642 (26.7%) missing values [Missing]
property_value has 15098 (10.2%) missing values [Missing]
income has 9150 (6.2%) missing values [Missing]
ltv has 15098 (10.2%) missing values [Missing]
ltv is highly skewed (γ1 = 120.6153375) [Skewed]
upfront_charges has 20770 (14.0%) zeros [Zeros]
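
The report does not say how the imbalance percentage is defined. The sketch below assumes a normalized-entropy score (1 minus Shannon entropy divided by its maximum); with the construction_type counts given later in the report (148637 and 33) this assumption reproduces the 99.7% shown in the alert. The function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0 means perfectly balanced classes, values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen here: a single class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

construction_type = [148_637, 33]   # sb, mh
status = [112_031, 36_639]          # 0, 1

print(f"construction_type: {imbalance_score(construction_type):.1%}")
print(f"status:            {imbalance_score(status):.1%}")
```

By the same measure the status variable scores far lower, which would explain why it raises no alert.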

Reproduction

Analysis started: 2024-06-08 23:08:05.829806
Analysis finished: 2024-06-08 23:08:56.450696
Duration: 50.62 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
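
A report of this kind can be regenerated with a few lines of Python. The sketch below is a minimal example, not the exact invocation used here: the file paths and the report title are hypothetical, and the configuration referenced above is not applied. It also recomputes the reported duration from the two timestamps.

```python
from datetime import datetime

def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (needs pandas and ydata-profiling)."""
    # Imported lazily so the rest of this snippet runs without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Loan data profile")  # title is arbitrary
    report.to_file(html_path)

def duration_seconds(started, finished):
    """Seconds elapsed between two timestamps in the report's format."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    return (datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)).total_seconds()

elapsed = duration_seconds("2024-06-08 23:08:05.829806", "2024-06-08 23:08:56.450696")
print(f"{elapsed:.2f} seconds")
```

Calling `build_report("loans.csv", "report.html")` with a real CSV path would produce the HTML file; both names here are placeholders.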

Variables

gender
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
Male: 42346
Joint: 41399
Sex Not Available: 37659
Female: 27266

Length

Max length: 17
Median length: 6
Mean length: 7.9382391
Min length: 4

Characters and Unicode

Total characters: 1180178
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sex Not Available
2nd row: Male
3rd row: Male
4th row: Male
5th row: Joint

Common Values

Value               Count    Frequency (%)
Male                42346    28.5%
Joint               41399    27.8%
Sex Not Available   37659    25.3%
Female              27266    18.3%
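
The length and character statistics for a categorical variable follow directly from its value counts. As a quick check, the sketch below rebuilds the total character count, the mean length, and the number of space characters for gender from the four category counts.

```python
# Category counts for gender, copied from the Common Values table.
counts = {
    "Male": 42_346,
    "Joint": 41_399,
    "Sex Not Available": 37_659,
    "Female": 27_266,
}

n_rows = sum(counts.values())

# Each category contributes len(label) characters per row it appears in.
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_rows

# Only "Sex Not Available" contains spaces (two per value).
spaces = sum(label.count(" ") * n for label, n in counts.items())

print(f"rows:             {n_rows}")
print(f"total characters: {total_chars}")
print(f"mean length:      {mean_length:.7f}")
print(f"space characters: {spaces}")
```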

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
male 42346
18.9%
joint 41399
18.5%
sex 37659
16.8%
not 37659
16.8%
available 37659
16.8%
female 27266
12.2%

Most occurring characters

Value              Count    Frequency (%)
e                  172196   14.6%
l                  144930   12.3%
a                  144930   12.3%
o                  79058    6.7%
i                  79058    6.7%
t                  79058    6.7%
(space)            75318    6.4%
M                  42346    3.6%
J                  41399    3.5%
n                  41399    3.5%
Other values (8)   280486   23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 880872
74.6%
Uppercase Letter 223988
 
19.0%
Space Separator 75318
 
6.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 172196
19.5%
l 144930
16.5%
a 144930
16.5%
o 79058
9.0%
i 79058
9.0%
t 79058
9.0%
n 41399
 
4.7%
b 37659
 
4.3%
v 37659
 
4.3%
x 37659
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
M 42346
18.9%
J 41399
18.5%
A 37659
16.8%
S 37659
16.8%
N 37659
16.8%
F 27266
12.2%
Space Separator
Value     Count   Frequency (%)
(space)   75318   100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1104860
93.6%
Common 75318
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 172196
15.6%
l 144930
13.1%
a 144930
13.1%
o 79058
 
7.2%
i 79058
 
7.2%
t 79058
 
7.2%
M 42346
 
3.8%
J 41399
 
3.7%
n 41399
 
3.7%
A 37659
 
3.4%
Other values (7) 242827
22.0%
Common
Value     Count   Frequency (%)
(space)   75318   100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1180178
100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
e                  172196   14.6%
l                  144930   12.3%
a                  144930   12.3%
o                  79058    6.7%
i                  79058    6.7%
t                  79058    6.7%
(space)            75318    6.4%
M                  42346    3.6%
J                  41399    3.5%
n                  41399    3.5%
Other values (8)   280486   23.8%

approv_in_adv
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 908
Missing (%): 0.6%
Memory size: 1.1 MiB
nopre: 124621
pre: 23141

Length

Max length: 5
Median length: 5
Mean length: 4.6867801
Min length: 3

Characters and Unicode

Total characters: 692528
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: nopre
2nd row: nopre
3rd row: pre
4th row: nopre
5th row: pre

Common Values

Value       Count    Frequency (%)
nopre       124621   83.8%
pre         23141    15.6%
(Missing)   908      0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
nopre 124621
84.3%
pre 23141
 
15.7%
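
The percentages for approv_in_adv differ slightly between the Common Values table (83.8% / 15.6%) and the table under the plot (84.3% / 15.7%). The arithmetic below suggests why: the first uses all rows as the denominator, the second only rows with a non-missing value.

```python
# Counts for approv_in_adv, copied from the report.
counts = {"nopre": 124_621, "pre": 23_141}
missing = 908

n_present = sum(counts.values())   # rows with a value
n_rows = n_present + missing       # all rows

# Share of all rows (missing values included in the denominator).
share_all = {k: 100 * v / n_rows for k, v in counts.items()}
# Share of non-missing rows only.
share_present = {k: 100 * v / n_present for k, v in counts.items()}

for label in counts:
    print(f"{label:6s} all rows: {share_all[label]:.1f}%   "
          f"non-missing rows: {share_present[label]:.1f}%")
```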

Most occurring characters

ValueCountFrequency (%)
p 147762
21.3%
r 147762
21.3%
e 147762
21.3%
n 124621
18.0%
o 124621
18.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 692528
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 147762
21.3%
r 147762
21.3%
e 147762
21.3%
n 124621
18.0%
o 124621
18.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 692528
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 147762
21.3%
r 147762
21.3%
e 147762
21.3%
n 124621
18.0%
o 124621
18.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 692528
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 147762
21.3%
r 147762
21.3%
e 147762
21.3%
n 124621
18.0%
o 124621
18.0%

loan_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
type1: 113173
type2: 20762
type3: 14735

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 743350
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: type1
2nd row: type2
3rd row: type1
4th row: type1
5th row: type1

Common Values

Value   Count    Frequency (%)
type1   113173   76.1%
type2   20762    14.0%
type3   14735    9.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
type1 113173
76.1%
type2 20762
 
14.0%
type3 14735
 
9.9%

Most occurring characters

ValueCountFrequency (%)
t 148670
20.0%
y 148670
20.0%
p 148670
20.0%
e 148670
20.0%
1 113173
15.2%
2 20762
 
2.8%
3 14735
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 594680
80.0%
Decimal Number 148670
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 148670
25.0%
y 148670
25.0%
p 148670
25.0%
e 148670
25.0%
Decimal Number
ValueCountFrequency (%)
1 113173
76.1%
2 20762
 
14.0%
3 14735
 
9.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 594680
80.0%
Common 148670
 
20.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 148670
25.0%
y 148670
25.0%
p 148670
25.0%
e 148670
25.0%
Common
ValueCountFrequency (%)
1 113173
76.1%
2 20762
 
14.0%
3 14735
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 743350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 148670
20.0%
y 148670
20.0%
p 148670
20.0%
e 148670
20.0%
1 113173
15.2%
2 20762
 
2.8%
3 14735
 
2.0%

loan_purpose
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 134
Missing (%): 0.1%
Memory size: 1.1 MiB
p3: 55934
p4: 54799
p1: 34529
p2: 3274

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 297072
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: p1
2nd row: p1
3rd row: p1
4th row: p4
5th row: p1

Common Values

Value       Count   Frequency (%)
p3          55934   37.6%
p4          54799   36.9%
p1          34529   23.2%
p2          3274    2.2%
(Missing)   134     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
p3 55934
37.7%
p4 54799
36.9%
p1 34529
23.2%
p2 3274
 
2.2%

Most occurring characters

ValueCountFrequency (%)
p 148536
50.0%
3 55934
 
18.8%
4 54799
 
18.4%
1 34529
 
11.6%
2 3274
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 148536
50.0%
Decimal Number 148536
50.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 55934
37.7%
4 54799
36.9%
1 34529
23.2%
2 3274
 
2.2%
Lowercase Letter
ValueCountFrequency (%)
p 148536
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 148536
50.0%
Common 148536
50.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 55934
37.7%
4 54799
36.9%
1 34529
23.2%
2 3274
 
2.2%
Latin
ValueCountFrequency (%)
p 148536
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 297072
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 148536
50.0%
3 55934
 
18.8%
4 54799
 
18.4%
1 34529
 
11.6%
2 3274
 
1.1%

[variable name missing in source]
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
nob/c: 127908
b/c: 20762

Length

Max length: 5
Median length: 5
Mean length: 4.7206968
Min length: 3

Characters and Unicode

Total characters: 701826
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: nob/c
2nd row: b/c
3rd row: nob/c
4th row: nob/c
5th row: nob/c

Common Values

Value   Count    Frequency (%)
nob/c   127908   86.0%
b/c     20762    14.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
nob/c 127908
86.0%
b/c 20762
 
14.0%

Most occurring characters

ValueCountFrequency (%)
b 148670
21.2%
/ 148670
21.2%
c 148670
21.2%
n 127908
18.2%
o 127908
18.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 553156
78.8%
Other Punctuation 148670
 
21.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
b 148670
26.9%
c 148670
26.9%
n 127908
23.1%
o 127908
23.1%
Other Punctuation
ValueCountFrequency (%)
/ 148670
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 553156
78.8%
Common 148670
 
21.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
b 148670
26.9%
c 148670
26.9%
n 127908
23.1%
o 127908
23.1%
Common
ValueCountFrequency (%)
/ 148670
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 701826
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
b 148670
21.2%
/ 148670
21.2%
c 148670
21.2%
n 127908
18.2%
o 127908
18.2%

loan_amount
Real number (ℝ)

Distinct: 211
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 331117.74
Minimum: 16500
Maximum: 3576500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 16500
5-th percentile: 106500
Q1: 196500
median: 296500
Q3: 436500
95-th percentile: 656500
Maximum: 3576500
Range: 3560000
Interquartile range (IQR): 240000

Descriptive statistics

Standard deviation: 183909.31
Coefficient of variation (CV): 0.55541968
Kurtosis: 9.1277753
Mean: 331117.74
Median Absolute Deviation (MAD): 120000
Skewness: 1.6669981
Sum: 4.9227275 × 10^10
Variance: 3.3822634 × 10^10
Monotonicity: Not monotonic
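
Several of the loan_amount statistics are simple functions of the others. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported values. The upper outlier fence (Q3 + 1.5 × IQR) is a common rule of thumb added here for illustration; the report itself does not use it.

```python
# Values copied from the loan_amount statistics.
minimum, maximum = 16_500, 3_576_500
q1, median, q3 = 196_500, 296_500, 436_500
mean, std = 331_117.74, 183_909.31

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

# Tukey's rule of thumb for flagging high outliers (not part of the report).
upper_fence = q3 + 1.5 * iqr

print(f"range:       {value_range}")
print(f"IQR:         {iqr}")
print(f"CV:          {cv:.8f}")
print(f"variance:    {variance:.7e}")
print(f"upper fence: {upper_fence:.0f}")
```

The reported maximum of 3576500 lies well above that fence, consistent with the positive skewness of about 1.67.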
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
206500 4610
 
3.1%
256500 4079
 
2.7%
156500 3967
 
2.7%
226500 3944
 
2.7%
486500 3819
 
2.6%
306500 3691
 
2.5%
246500 3669
 
2.5%
216500 3649
 
2.5%
236500 3553
 
2.4%
266500 3543
 
2.4%
Other values (201) 110146
74.1%
ValueCountFrequency (%)
16500 3
 
< 0.1%
26500 27
 
< 0.1%
36500 119
 
0.1%
46500 212
 
0.1%
56500 810
 
0.5%
66500 859
 
0.6%
76500 1701
1.1%
86500 1605
1.1%
96500 1484
1.0%
106500 3210
2.2%
ValueCountFrequency (%)
3576500 1
 
< 0.1%
3346500 1
 
< 0.1%
3006500 4
< 0.1%
2986500 1
 
< 0.1%
2926500 1
 
< 0.1%
2706500 1
 
< 0.1%
2626500 1
 
< 0.1%
2606500 1
 
< 0.1%
2596500 1
 
< 0.1%
2506500 2
< 0.1%

rate_of_interest
Real number (ℝ)

MISSING 

Distinct: 131
Distinct (%): 0.1%
Missing: 36439
Missing (%): 24.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.0454758
Minimum: 0
Maximum: 8
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3.125
Q1: 3.625
median: 3.99
Q3: 4.375
95-th percentile: 4.99
Maximum: 8
Range: 8
Interquartile range (IQR): 0.75

Descriptive statistics

Standard deviation: 0.56139119
Coefficient of variation (CV): 0.13877013
Kurtosis: 0.34456404
Mean: 4.0454758
Median Absolute Deviation (MAD): 0.365
Skewness: 0.38840603
Sum: 454027.8
Variance: 0.31516007
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3.99 14455
 
9.7%
3.625 8800
 
5.9%
3.875 8592
 
5.8%
3.75 8474
 
5.7%
3.5 6866
 
4.6%
4.5 6809
 
4.6%
4.375 6482
 
4.4%
4.25 6045
 
4.1%
4.125 5797
 
3.9%
4.75 4875
 
3.3%
Other values (121) 35036
23.6%
(Missing) 36439
24.5%
ValueCountFrequency (%)
0 1
 
< 0.1%
2.125 1
 
< 0.1%
2.25 4
 
< 0.1%
2.375 2
 
< 0.1%
2.475 2
 
< 0.1%
2.5 21
< 0.1%
2.575 1
 
< 0.1%
2.6 3
 
< 0.1%
2.625 25
< 0.1%
2.65 2
 
< 0.1%
ValueCountFrequency (%)
8 1
 
< 0.1%
7.75 1
 
< 0.1%
7.5 2
 
< 0.1%
7.375 1
 
< 0.1%
7.125 1
 
< 0.1%
7 1
 
< 0.1%
6.875 1
 
< 0.1%
6.75 5
< 0.1%
6.5 3
< 0.1%
6.375 1
 
< 0.1%

upfront_charges
Real number (ℝ)

MISSING  ZEROS 

Distinct: 58271
Distinct (%): 53.4%
Missing: 39642
Missing (%): 26.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 3224.9961
Minimum: 0
Maximum: 60000
Zeros: 20770
Zeros (%): 14.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 581.49
median: 2596.45
Q3: 4812.5
95-th percentile: 9272.6885
Maximum: 60000
Range: 60000
Interquartile range (IQR): 4231.01

Descriptive statistics

Standard deviation: 3251.1215
Coefficient of variation (CV): 1.0081009
Kurtosis: 6.3685863
Mean: 3224.9961
Median Absolute Deviation (MAD): 2108.66
Skewness: 1.7540757
Sum: 3.5161488 × 10^8
Variance: 10569791
Monotonicity: Not monotonic
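
The percentages for upfront_charges do not all share one denominator. Working back from the counts, the distinct percentage appears to be taken over non-missing rows, while the zeros percentage is taken over all rows. The sketch below shows the arithmetic; this is an inference from the numbers rather than a documented rule.

```python
# Counts copied from the upfront_charges statistics.
n_rows = 148_670
missing = 39_642
distinct = 58_271
zeros = 20_770
total = 3.5161488e8               # reported Sum

n_present = n_rows - missing      # rows with a value

distinct_pct = 100 * distinct / n_present       # matches the reported 53.4%
zeros_pct_all = 100 * zeros / n_rows            # matches the reported 14.0%
zeros_pct_present = 100 * zeros / n_present     # share among non-missing rows
mean = total / n_present                        # the mean ignores missing rows

print(f"non-missing rows:         {n_present}")
print(f"distinct (% of present):  {distinct_pct:.1f}%")
print(f"zeros (% of all rows):    {zeros_pct_all:.1f}%")
print(f"zeros (% of present):     {zeros_pct_present:.1f}%")
print(f"mean:                     {mean:.4f}")
```

Among rows that actually have a value, the share of zeros is noticeably higher than the headline 14.0%.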
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 20770
 
14.0%
1250 1184
 
0.8%
1150 892
 
0.6%
795 487
 
0.3%
295 403
 
0.3%
950 192
 
0.1%
3000 173
 
0.1%
995 151
 
0.1%
4000 149
 
0.1%
5000 147
 
0.1%
Other values (58261) 84480
56.8%
(Missing) 39642
26.7%
ValueCountFrequency (%)
0 20770
14.0%
0.03 1
 
< 0.1%
0.06 1
 
< 0.1%
0.35 1
 
< 0.1%
0.6 1
 
< 0.1%
0.72 1
 
< 0.1%
0.75 1
 
< 0.1%
0.92 1
 
< 0.1%
1 12
 
< 0.1%
1.15 1
 
< 0.1%
ValueCountFrequency (%)
60000 1
< 0.1%
53485.78 1
< 0.1%
38437.5 1
< 0.1%
38375 1
< 0.1%
37604.38 1
< 0.1%
35192.5 1
< 0.1%
33268 1
< 0.1%
32850 1
< 0.1%
32825.25 1
< 0.1%
32647 1
< 0.1%

term
Real number (ℝ)

Distinct: 26
Distinct (%): < 0.1%
Missing: 41
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 335.13658
Minimum: 96
Maximum: 360
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 96
5-th percentile: 180
Q1: 360
median: 360
Q3: 360
95-th percentile: 360
Maximum: 360
Range: 264
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 58.409084
Coefficient of variation (CV): 0.17428442
Kurtosis: 3.1732363
Mean: 335.13658
Median Absolute Deviation (MAD): 0
Skewness: -2.1748218
Sum: 49811015
Variance: 3411.621
Monotonicity: Not monotonic
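
An IQR and MAD of 0 may look like an error, but they follow from how concentrated term is: 121685 of the non-missing values equal the maximum, 360. The sketch below shows which quantiles must then equal 360; the helper name is illustrative.

```python
# Counts copied from the term statistics.
n_rows = 148_670
missing = 41
count_at_max = 121_685            # rows where term == 360, the maximum

n_present = n_rows - missing
share_at_max = count_at_max / n_present

# In sorted order the top `share_at_max` fraction of values is all 360,
# so every quantile above this threshold equals 360.
threshold = 1 - share_at_max

def quantile_equals_max(q):
    """True if the q-th quantile (0 < q < 1) must equal the maximum."""
    return q > threshold

print(f"share at 360: {share_at_max:.1%}")
print(f"threshold:    {threshold:.3f}")
for q in (0.05, 0.25, 0.50, 0.75):
    print(f"quantile {q:.2f} equals 360: {quantile_equals_max(q)}")
```

Q1, the median, and Q3 all coincide at 360, so the IQR is 0, and because more than half of the values sit exactly at the median, the MAD is 0 as well.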
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
360 121685
81.8%
180 12981
 
8.7%
240 5859
 
3.9%
300 2822
 
1.9%
324 2766
 
1.9%
120 510
 
0.3%
144 263
 
0.2%
348 260
 
0.2%
336 213
 
0.1%
96 194
 
0.1%
Other values (16) 1076
 
0.7%
ValueCountFrequency (%)
96 194
 
0.1%
108 33
 
< 0.1%
120 510
 
0.3%
132 93
 
0.1%
144 263
 
0.2%
156 174
 
0.1%
165 1
 
< 0.1%
168 82
 
0.1%
180 12981
8.7%
192 17
 
< 0.1%
ValueCountFrequency (%)
360 121685
81.8%
348 260
 
0.2%
336 213
 
0.1%
324 2766
 
1.9%
322 1
 
< 0.1%
312 185
 
0.1%
300 2822
 
1.9%
288 90
 
0.1%
280 1
 
< 0.1%
276 100
 
0.1%

property_value
Real number (ℝ)

MISSING 

Distinct: 385
Distinct (%): 0.3%
Missing: 15098
Missing (%): 10.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 497893.47
Minimum: 8000
Maximum: 16508000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 8000
5-th percentile: 148000
Q1: 268000
median: 418000
Q3: 628000
95-th percentile: 1058000
Maximum: 16508000
Range: 16500000
Interquartile range (IQR): 360000

Descriptive statistics

Standard deviation: 359935.32
Coefficient of variation (CV): 0.72291633
Kurtosis: 73.221196
Mean: 497893.47
Median Absolute Deviation (MAD): 170000
Skewness: 4.5862758
Sum: 6.6504626 × 10^10
Variance: 1.2955343 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
308000 2792
 
1.9%
258000 2763
 
1.9%
358000 2679
 
1.8%
408000 2537
 
1.7%
328000 2524
 
1.7%
278000 2513
 
1.7%
268000 2497
 
1.7%
228000 2493
 
1.7%
238000 2408
 
1.6%
288000 2398
 
1.6%
Other values (375) 107968
72.6%
(Missing) 15098
 
10.2%
ValueCountFrequency (%)
8000 6
 
< 0.1%
18000 1
 
< 0.1%
28000 9
 
< 0.1%
38000 35
 
< 0.1%
48000 71
 
< 0.1%
58000 141
 
0.1%
68000 271
0.2%
78000 387
0.3%
88000 568
0.4%
98000 556
0.4%
ValueCountFrequency (%)
16508000 1
< 0.1%
12008000 1
< 0.1%
11008000 1
< 0.1%
10008000 1
< 0.1%
9268000 1
< 0.1%
8508000 1
< 0.1%
7608000 1
< 0.1%
6908000 1
< 0.1%
6508000 1
< 0.1%
6408000 1
< 0.1%

construction_type
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
sb: 148637
mh: 33

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 297340
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: sb
2nd row: sb
3rd row: sb
4th row: sb
5th row: sb

Common Values

Value   Count    Frequency (%)
sb      148637   > 99.9%
mh      33       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
sb 148637
> 99.9%
mh 33
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
s 148637
50.0%
b 148637
50.0%
m 33
 
< 0.1%
h 33
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 297340
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 148637
50.0%
b 148637
50.0%
m 33
 
< 0.1%
h 33
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 297340
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 148637
50.0%
b 148637
50.0%
m 33
 
< 0.1%
h 33
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 297340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 148637
50.0%
b 148637
50.0%
m 33
 
< 0.1%
h 33
 
< 0.1%

income
Real number (ℝ)

MISSING 

Distinct: 1001
Distinct (%): 0.7%
Missing: 9150
Missing (%): 6.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 6957.3389
Minimum: 0
Maximum: 578580
Zeros: 1260
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1920
Q1: 3720
median: 5760
Q3: 8520
95-th percentile: 15420
Maximum: 578580
Range: 578580
Interquartile range (IQR): 4800

Descriptive statistics

Standard deviation: 6496.5864
Coefficient of variation (CV): 0.93377461
Kurtosis: 885.29246
Mean: 6957.3389
Median Absolute Deviation (MAD): 2280
Skewness: 17.307695
Sum: 9.7068792 × 10^8
Variance: 42205635
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1260
 
0.8%
3600 1250
 
0.8%
4200 1243
 
0.8%
4800 1191
 
0.8%
3120 1168
 
0.8%
3720 1161
 
0.8%
3900 1159
 
0.8%
5400 1152
 
0.8%
3300 1144
 
0.8%
4500 1139
 
0.8%
Other values (991) 127653
85.9%
(Missing) 9150
 
6.2%
ValueCountFrequency (%)
0 1260
0.8%
60 5
 
< 0.1%
120 12
 
< 0.1%
180 12
 
< 0.1%
240 15
 
< 0.1%
300 18
 
< 0.1%
360 11
 
< 0.1%
420 15
 
< 0.1%
480 11
 
< 0.1%
540 17
 
< 0.1%
ValueCountFrequency (%)
578580 1
< 0.1%
377220 1
< 0.1%
374400 1
< 0.1%
335880 2
< 0.1%
329460 1
< 0.1%
322860 1
< 0.1%
312000 1
< 0.1%
240000 1
< 0.1%
235980 1
< 0.1%
198060 1
< 0.1%

credit_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
CIB: 48152
CRIF: 43901
EXP: 41319
EQUI: 15298

Length

Max length: 4
Median length: 3
Mean length: 3.3981906
Min length: 3

Characters and Unicode

Total characters: 505209
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EXP
2nd row: EQUI
3rd row: EXP
4th row: EXP
5th row: CRIF

Common Values

Value   Count   Frequency (%)
CIB     48152   32.4%
CRIF    43901   29.5%
EXP     41319   27.8%
EQUI    15298   10.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cib 48152
32.4%
crif 43901
29.5%
exp 41319
27.8%
equi 15298
 
10.3%

Most occurring characters

ValueCountFrequency (%)
I 107351
21.2%
C 92053
18.2%
E 56617
11.2%
B 48152
9.5%
R 43901
8.7%
F 43901
8.7%
X 41319
 
8.2%
P 41319
 
8.2%
Q 15298
 
3.0%
U 15298
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 505209
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 107351
21.2%
C 92053
18.2%
E 56617
11.2%
B 48152
9.5%
R 43901
8.7%
F 43901
8.7%
X 41319
 
8.2%
P 41319
 
8.2%
Q 15298
 
3.0%
U 15298
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 505209
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 107351
21.2%
C 92053
18.2%
E 56617
11.2%
B 48152
9.5%
R 43901
8.7%
F 43901
8.7%
X 41319
 
8.2%
P 41319
 
8.2%
Q 15298
 
3.0%
U 15298
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 505209
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 107351
21.2%
C 92053
18.2%
E 56617
11.2%
B 48152
9.5%
R 43901
8.7%
F 43901
8.7%
X 41319
 
8.2%
P 41319
 
8.2%
Q 15298
 
3.0%
U 15298
 
3.0%

credit_score
Real number (ℝ)

Distinct: 401
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 699.7891
Minimum: 500
Maximum: 900
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 500
5-th percentile: 519
Q1: 599
median: 699
Q3: 800
95-th percentile: 881
Maximum: 900
Range: 400
Interquartile range (IQR): 201

Descriptive statistics

Standard deviation: 115.87586
Coefficient of variation (CV): 0.16558683
Kurtosis: -1.2026494
Mean: 699.7891
Median Absolute Deviation (MAD): 100
Skewness: 0.004766757
Sum: 1.0403765 × 10^8
Variance: 13427.214
Monotonicity: Not monotonic
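
The near-zero skewness and the kurtosis of about -1.2 for credit_score are what a roughly uniform distribution would produce. As a hedged comparison, the sketch below computes the theoretical moments of a discrete uniform distribution on the integers 500 to 900. The uniform model is an assumption made for illustration; the report makes no such claim.

```python
# Assumed model: integers from 500 to 900, each equally likely.
lo, hi = 500, 900
n = hi - lo + 1                        # 401 values, matching "Distinct: 401"

# Closed-form moments of a discrete uniform distribution on n consecutive integers.
mean = (lo + hi) / 2
variance = (n ** 2 - 1) / 12
std = variance ** 0.5
excess_kurtosis = -6 * (n ** 2 + 1) / (5 * (n ** 2 - 1))

# Values reported for credit_score, for side-by-side comparison.
reported = {"mean": 699.7891, "std": 115.87586, "kurtosis": -1.2026494}

print(f"mean:     model {mean:.2f}   reported {reported['mean']}")
print(f"std:      model {std:.3f}  reported {reported['std']}")
print(f"kurtosis: model {excess_kurtosis:.4f}  reported {reported['kurtosis']}")
```

The model values are close to, but not identical with, the reported ones, so the comparison is suggestive rather than conclusive.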
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
763 415
 
0.3%
867 413
 
0.3%
639 411
 
0.3%
581 408
 
0.3%
554 407
 
0.3%
519 406
 
0.3%
737 406
 
0.3%
890 406
 
0.3%
687 405
 
0.3%
617 405
 
0.3%
Other values (391) 144588
97.3%
ValueCountFrequency (%)
500 357
0.2%
501 357
0.2%
502 346
0.2%
503 383
0.3%
504 392
0.3%
505 379
0.3%
506 380
0.3%
507 386
0.3%
508 400
0.3%
509 348
0.2%
ValueCountFrequency (%)
900 393
0.3%
899 352
0.2%
898 370
0.2%
897 383
0.3%
896 391
0.3%
895 371
0.2%
894 361
0.2%
893 348
0.2%
892 366
0.2%
891 376
0.3%

age
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 200
Missing (%): 0.1%
Memory size: 1.1 MiB
45-54: 34720
35-44: 32818
55-64: 32534
65-74: 20744
25-34: 19142
Other values (2): 8512

Length

Max length: 5
Median length: 5
Mean length: 4.8853371
Min length: 3

Characters and Unicode

Total characters: 725326
Distinct characters: 9
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 25-34
2nd row: 55-64
3rd row: 35-44
4th row: 45-54
5th row: 25-34

Common Values

Value       Count   Frequency (%)
45-54       34720   23.4%
35-44       32818   22.1%
55-64       32534   21.9%
65-74       20744   14.0%
25-34       19142   12.9%
>74         7175    4.8%
<25         1337    0.9%
(Missing)   200     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
45-54 34720
23.4%
35-44 32818
22.1%
55-64 32534
21.9%
65-74 20744
14.0%
25-34 19142
12.9%
74 7175
 
4.8%
25 1337
 
0.9%

Most occurring characters

ValueCountFrequency (%)
4 214671
29.6%
5 208549
28.8%
- 139958
19.3%
6 53278
 
7.3%
3 51960
 
7.2%
7 27919
 
3.8%
2 20479
 
2.8%
> 7175
 
1.0%
< 1337
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 576856
79.5%
Dash Punctuation 139958
 
19.3%
Math Symbol 8512
 
1.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 214671
37.2%
5 208549
36.2%
6 53278
 
9.2%
3 51960
 
9.0%
7 27919
 
4.8%
2 20479
 
3.6%
Math Symbol
ValueCountFrequency (%)
> 7175
84.3%
< 1337
 
15.7%
Dash Punctuation
ValueCountFrequency (%)
- 139958
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 725326
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 214671
29.6%
5 208549
28.8%
- 139958
19.3%
6 53278
 
7.3%
3 51960
 
7.2%
7 27919
 
3.8%
2 20479
 
2.8%
> 7175
 
1.0%
< 1337
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 725326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 214671
29.6%
5 208549
28.8%
- 139958
19.3%
6 53278
 
7.3%
3 51960
 
7.2%
7 27919
 
3.8%
2 20479
 
2.8%
> 7175
 
1.0%
< 1337
 
0.2%

ltv
Real number (ℝ)

MISSING  SKEWED 

Distinct: 8484
Distinct (%): 6.4%
Missing: 15098
Missing (%): 10.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.746457
Minimum: 0.9674782
Maximum: 7831.25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0.9674782
5-th percentile: 36.350575
Q1: 60.47486
median: 75.13587
Q3: 86.184211
95-th percentile: 98.728814
Maximum: 7831.25
Range: 7830.2825
Interquartile range (IQR): 25.70935

Descriptive statistics

Standard deviation: 39.967603
Coefficient of variation (CV): 0.54940961
Kurtosis: 19979.045
Mean: 72.746457
Median Absolute Deviation (MAD): 12.514733
Skewness: 120.61534
Sum: 9716889.8
Variance: 1597.4093
Monotonicity: Not monotonic
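
The extreme skewness of ltv comes from a handful of very large values. The sketch below assumes ltv is the loan-to-value ratio in percent (100 × loan_amount / property_value), which the report does not state explicitly; the missing counts of ltv and property_value do coincide (15098 each), which is consistent with that reading. The loan and property figures used are hypothetical pairs chosen from the value grids seen in the report, not rows taken from the data. The robust z-score with the 1.4826 scale factor is a standard outlier heuristic added for illustration.

```python
def ltv(loan_amount, property_value):
    """Loan-to-value ratio in percent (assumed definition)."""
    return 100 * loan_amount / property_value

# Hypothetical pairs: a property value at the reported minimum (8000)
# combined with ordinary loan amounts yields ltv values in the thousands.
print(ltv(626_500, 8_000))
print(ltv(536_500, 8_000))

# Robust z-score based on the reported median and MAD of ltv.
median, mad = 75.13587, 12.514733

def robust_z(x):
    """Distance from the median in MAD units, scaled to be comparable to a z-score."""
    return (x - median) / (1.4826 * mad)

for value in (98.728814, 263.5416667, 7831.25):
    print(f"ltv {value:>10.2f} -> robust z = {robust_z(value):7.1f}")
```

Under this reading, the largest ltv values would stem from implausibly low property values rather than unusually large loans, which is worth checking against the raw rows before any modelling.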
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
81.25 530
 
0.4%
91.66666667 499
 
0.3%
80.03875969 380
 
0.3%
80.03246753 328
 
0.2%
94.95614035 322
 
0.2%
78.84615385 317
 
0.2%
78.64583333 310
 
0.2%
79.04040404 309
 
0.2%
80.06329114 309
 
0.2%
95.16806723 306
 
0.2%
Other values (8474) 129962
87.4%
(Missing) 15098
 
10.2%
ValueCountFrequency (%)
0.967478198 1
< 0.1%
2.072942643 1
< 0.1%
2.767587397 1
< 0.1%
2.81374502 1
< 0.1%
2.856420627 1
< 0.1%
2.992584746 1
< 0.1%
3.083554377 1
< 0.1%
3.125 1
< 0.1%
3.74668435 1
< 0.1%
3.875171468 1
< 0.1%
ValueCountFrequency (%)
7831.25 1
< 0.1%
6706.25 1
< 0.1%
5206.25 1
< 0.1%
4706.25 1
< 0.1%
2956.25 1
< 0.1%
2331.25 1
< 0.1%
263.5416667 1
< 0.1%
237.5 2
< 0.1%
220.3629032 1
< 0.1%
201.7857143 1
< 0.1%

region
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
North: 74722
south: 64016
central: 8697
North-East: 1235

Length

Max length: 10
Median length: 5
Mean length: 5.1585323
Min length: 5

Characters and Unicode

Total characters: 766919
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: south
2nd row: North
3rd row: south
4th row: North
5th row: North

Common Values

Value        Count   Frequency (%)
North        74722   50.3%
south        64016   43.1%
central      8697    5.8%
North-East   1235    0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
north 74722
50.3%
south 64016
43.1%
central 8697
 
5.8%
north-east 1235
 
0.8%

Most occurring characters

ValueCountFrequency (%)
t 149905
19.5%
o 139973
18.3%
h 139973
18.3%
r 84654
11.0%
N 75957
9.9%
s 65251
8.5%
u 64016
8.3%
a 9932
 
1.3%
c 8697
 
1.1%
e 8697
 
1.1%
Other values (4) 19864
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 688492
89.8%
Uppercase Letter 77192
 
10.1%
Dash Punctuation 1235
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 149905
21.8%
o 139973
20.3%
h 139973
20.3%
r 84654
12.3%
s 65251
9.5%
u 64016
9.3%
a 9932
 
1.4%
c 8697
 
1.3%
e 8697
 
1.3%
n 8697
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
N 75957
98.4%
E 1235
 
1.6%
Dash Punctuation
ValueCountFrequency (%)
- 1235
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 765684
99.8%
Common 1235
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 149905
19.6%
o 139973
18.3%
h 139973
18.3%
r 84654
11.1%
N 75957
9.9%
s 65251
8.5%
u 64016
8.4%
a 9932
 
1.3%
c 8697
 
1.1%
e 8697
 
1.1%
Other values (3) 18629
 
2.4%
Common
ValueCountFrequency (%)
- 1235
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 766919
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 149905
19.5%
o 139973
18.3%
h 139973
18.3%
r 84654
11.0%
N 75957
9.9%
s 65251
8.5%
u 64016
8.3%
a 9932
 
1.3%
c 8697
 
1.1%
e 8697
 
1.1%
Other values (4) 19864
 
2.6%

status
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
0: 112031
1: 36639

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 148670
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count    Frequency (%)
0       112031   75.4%
1       36639    24.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 112031
75.4%
1 36639
 
24.6%

Most occurring characters

ValueCountFrequency (%)
0 112031
75.4%
1 36639
 
24.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 148670
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 112031
75.4%
1 36639
 
24.6%

Most occurring scripts

ValueCountFrequency (%)
Common 148670
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 112031
75.4%
1 36639
 
24.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 148670
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 112031
75.4%
1 36639
 
24.6%

Interactions

[Figure: grid of 64 pairwise interaction plots (Matplotlib v3.9.0); images not reproduced]

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
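
One way to read nullity correlation is as the Pearson correlation between the missing-value indicators of two columns. The sketch below assumes that interpretation. The function name `nullity_correlation` and the five-row columns are hypothetical, and `None` stands in for a missing value.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is entirely present or entirely missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical five-row columns; None marks a missing cell.
property_value = [118000.0, None, 508000.0, 658000.0, 758000.0]
ltv = [98.728814, None, 80.019685, 69.376900, 91.886544]
upfront_charges = [None, None, 595.0, None, 0.0]

print(nullity_correlation(property_value, ltv))              # missing in the same rows
print(nullity_correlation(property_value, upfront_charges))  # only partly overlapping
```

Two columns that are always missing together score close to 1, while columns whose gaps only partly overlap score lower.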

Sample

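index | gender | approv_in_adv | loan_type | loan_purpose | business_or_commercial | loan_amount | rate_of_interest | upfront_charges | term | property_value | construction_type | income | credit_type | credit_score | age | ltv | region | status
0 | Sex Not Available | nopre | type1 | p1 | nob/c | 116500 | NaN | NaN | 360.0 | 118000.0 | sb | 1740.0 | EXP | 758 | 25-34 | 98.728814 | south | 1
1 | Male | nopre | type2 | p1 | b/c | 206500 | NaN | NaN | 360.0 | NaN | sb | 4980.0 | EQUI | 552 | 55-64 | NaN | North | 1
2 | Male | pre | type1 | p1 | nob/c | 406500 | 4.560 | 595.00 | 360.0 | 508000.0 | sb | 9480.0 | EXP | 834 | 35-44 | 80.019685 | south | 0
3 | Male | nopre | type1 | p4 | nob/c | 456500 | 4.250 | NaN | 360.0 | 658000.0 | sb | 11880.0 | EXP | 587 | 45-54 | 69.376900 | North | 0
4 | Joint | pre | type1 | p1 | nob/c | 696500 | 4.000 | 0.00 | 360.0 | 758000.0 | sb | 10440.0 | CRIF | 602 | 25-34 | 91.886544 | North | 0
5 | Joint | pre | type1 | p1 | nob/c | 706500 | 3.990 | 370.00 | 360.0 | 1008000.0 | sb | 10080.0 | EXP | 864 | 35-44 | 70.089286 | North | 0
6 | Joint | pre | type1 | p3 | nob/c | 346500 | 4.500 | 5120.00 | 360.0 | 438000.0 | sb | 5040.0 | EXP | 860 | 55-64 | 79.109589 | North | 0
7 | Female | nopre | type1 | p4 | nob/c | 266500 | 4.125 | 5609.88 | 360.0 | 308000.0 | sb | 3780.0 | CIB | 863 | 55-64 | 86.525974 | North | 0
8 | Joint | nopre | type1 | p3 | nob/c | 376500 | 4.875 | 1150.00 | 360.0 | 478000.0 | sb | 5580.0 | CIB | 580 | 55-64 | 78.765690 | central | 0
9 | Sex Not Available | nopre | type3 | p3 | nob/c | 436500 | 3.490 | 2316.50 | 360.0 | 688000.0 | sb | 6720.0 | CIB | 788 | 55-64 | 63.444767 | south | 0

index | gender | approv_in_adv | loan_type | loan_purpose | business_or_commercial | loan_amount | rate_of_interest | upfront_charges | term | property_value | construction_type | income | credit_type | credit_score | age | ltv | region | status
148660 | Female | nopre | type1 | p4 | nob/c | 366500 | 3.875 | 3643.16 | 360.0 | 658000.0 | sb | 7200.0 | CIB | 851 | 45-54 | 55.699088 | North | 0
148661 | Sex Not Available | nopre | type2 | p4 | b/c | 346500 | NaN | NaN | 360.0 | 358000.0 | sb | NaN | EXP | 585 | 25-34 | 96.787710 | south | 1
148662 | Joint | nopre | type1 | p4 | nob/c | 646500 | 3.625 | 7639.80 | 360.0 | 828000.0 | sb | 13500.0 | CIB | 873 | 45-54 | 78.079710 | North | 0
148663 | Male | nopre | type2 | p1 | b/c | 106500 | NaN | NaN | 360.0 | NaN | sb | 1860.0 | EQUI | 619 | <25 | NaN | North | 1
148664 | Joint | nopre | type2 | p1 | b/c | 156500 | 3.990 | 3113.06 | 360.0 | 158000.0 | sb | 4020.0 | EXP | 859 | 65-74 | 99.050633 | central | 0
148665 | Sex Not Available | nopre | type1 | p3 | nob/c | 436500 | 3.125 | 9960.00 | 180.0 | 608000.0 | sb | 7860.0 | CIB | 659 | 55-64 | 71.792763 | south | 0
148666 | Male | nopre | type1 | p1 | nob/c | 586500 | 5.190 | 0.00 | 360.0 | 788000.0 | sb | 7140.0 | CIB | 569 | 25-34 | 74.428934 | south | 0
148667 | Male | nopre | type1 | p4 | nob/c | 446500 | 3.125 | 1226.64 | 180.0 | 728000.0 | sb | 6900.0 | CIB | 702 | 45-54 | 61.332418 | North | 0
148668 | Female | nopre | type1 | p4 | nob/c | 196500 | 3.500 | 4323.33 | 180.0 | 278000.0 | sb | 7140.0 | EXP | 737 | 55-64 | 70.683453 | North | 0
148669 | Female | nopre | type1 | p3 | nob/c | 406500 | 4.375 | 6000.00 | 240.0 | 558000.0 | sb | 7260.0 | CIB | 830 | 45-54 | 72.849462 | North | 0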
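
In the sample rows above, ltv is numerically consistent with loan_amount divided by property_value, expressed as a percentage, and it is NaN wherever property_value is NaN. The report does not state this relationship; it is an inference from the displayed rows. The sketch below checks it on three of them, using a hypothetical helper named `implied_ltv`.

```python
import math

# (loan_amount, property_value, ltv) copied from sample rows 0, 2 and 3 above.
rows = [
    (116500, 118000.0, 98.728814),
    (406500, 508000.0, 80.019685),
    (456500, 658000.0, 69.376900),
]


def implied_ltv(loan_amount, property_value):
    """Loan-to-value as a percentage, under the assumed definition."""
    return 100.0 * loan_amount / property_value


for loan_amount, property_value, ltv in rows:
    # The reported values are rounded to six decimals, so allow a small tolerance.
    matches = math.isclose(implied_ltv(loan_amount, property_value), ltv, abs_tol=1e-5)
    print(loan_amount, property_value, ltv, matches)
```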

Duplicate rows

Most frequently occurring

index | gender | approv_in_adv | loan_type | loan_purpose | business_or_commercial | loan_amount | rate_of_interest | upfront_charges | term | property_value | construction_type | income | credit_type | credit_score | age | ltv | region | status | # duplicates
0 | Male | nopre | type2 | p4 | b/c | 236500 | NaN | NaN | 360.0 | 248000.0 | sb | 3120.0 | CRIF | 673 | 35-44 | 95.362903 | central | 1 | 2
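
A minimal sketch of how fully identical rows can be counted with the standard library. The helper `duplicate_rows` and the shortened four-field rows are hypothetical. Missing values are represented as `None` rather than a float NaN, because two separate NaN objects do not compare equal and would hide duplicates such as the row above.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical shortened rows: (gender, loan_amount, rate_of_interest, property_value).
rows = [
    ("Male", 236500, None, 248000.0),
    ("Joint", 696500, 4.0, 758000.0),
    ("Male", 236500, None, 248000.0),
]
print(duplicate_rows(rows))
```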